Abstract

We study the optimal control of storage which is used for both arbitrage and 
buffering against unexpected events, with particular applications to the control of 
energy systems in a stochastic and typically time-heterogeneous environment. Our 
philosophy is that of viewing the problem as being formally one of stochastic dynamic 
programming, but of using coupling arguments to provide good estimates of the costs 
of failing to provide necessary levels of buffering. The problem of control then reduces 
to that of the solution, dynamically in time, of a deterministic optimisation problem 
which must be periodically re-solved. We show that the optimal control then proceeds 
locally in time, in the sense that the optimal decision at each time t depends only on 
a knowledge of the future costs and stochastic evolution of the system for a time 
horizon which typically extends only a little way beyond t. The approach is thus both 
computationally tractable and suitable for the management of systems over indefinitely 
extended periods of time. We develop also the associated strong Lagrangian theory 
(which may be used to assist in the optimal dimensioning of storage), and we provide 
characterisations of optimal control policies. We give examples based on Great Britain 
electricity price data. 


1 Introduction 

The control of complex stochastic systems, for example modern power networks which 
must cope with many sources of uncertainty in both generation and demand, requires 
real-time optimisation of decision problems which are often computationally intractable— 
notably so in a time-heterogeneous environment. This clearly also poses difficulties for 
the design of such systems. As in the case of the well studied areas of communication and 
manufacturing networks, our belief is that what is required is the careful specification of 
the stochastic models governing the behaviour of such systems, coupled with the analytical 
derivation of accurate approximation techniques. 

In the present paper we use an economic framework to consider the optimal control of 
a single storage facility. The problem is made interesting because, at least in power 
networks, storage may be simultaneously used for many different purposes, with potentially 
conflicting objective functions. However, if storage is to be economically viable, it must 
be capable of meeting these competing objectives. We concentrate on energy storage in 
a time-heterogeneous environment, and consider two of the main uses of such storage 
systems: (a) price arbitrage, i.e. the buying and selling of energy over time (whether to
earn revenue for the store owner or for the benefit of the consumer), and (b) the provision
of buffering services, so as to react rapidly to sudden and unexpected changes, for example 
the loss of a generator or transmission line, or a sudden surge in demand. Our general 
approach is likely to be applicable to other uses of storage, and also to the optimal control 
of other facilities used for the provision of multiple services. 

There is considerable literature on the control of storage for each of the above two purposes 
considered on its own. In the case of the use of storage for arbitrage, and with linear cost 
functions for buying and selling at each instant in time, the problem of optimal control 
is the classical warehouse problem (see [1, 2, 3] and also [4] for a more recent example). 
Cruise et al [5] consider the optimal control of storage—in both a deterministic and a 
stochastic setting—in the case where the store is a price maker (i.e. the size of the store is 
sufficiently large that its activities influence prices in the market in which it operates) and 
is subject to both capacity and rate constraints; they develop the associated Lagrangian 
theory, and further show that the optimal control at any point in time usually depends 
only on the cost functions associated with a short future time horizon. Recent alternative 
approaches for studying the value and use of storage for arbitrage can be found in the 
papers [6, 7, 8, 9, 10]; see also the text [11], and the further references given in [5]. For an
assessment of the potential value of energy storage in the UK electricity system see [12]. 

There have been numerous studies into the use of storage for buffering against both the 
increased variability and the increased uncertainty in electrical power systems, due to 
higher penetration of renewable generation (the former due to the natural variability of
such resources as wind power, and the latter due to the inherent uncertainty of forecasting).
These studies have considered many different more detailed objectives; these range from 
the sizing and control of storage facilities co-located with the renewable generation so as 
to provide a smoother supply and so offset the need for network reinforcement [13, 14, 15], 
to studies on storage embedded within transmission networks so as to increase wind power 
utilisation and so reduce overall generation costs [16, 17, 18]. In addition there have been 
a number of studies into the more general use of storage for buffering, for example, so as 
to provide fast frequency response to power networks [19, 20, 12], or to provide quality of 
service as part of a microgrid [21, 22]. 

In general the problem of using a store for buffering is necessarily stochastic. The natural 
mathematical approach is via stochastic dynamic programming. This, however, is liable to 
be computationally intractable, especially in the case of long time horizons and the likely 
time heterogeneity of the stochastic processes involved. Therefore much of the literature 
considers necessarily somewhat heuristic but nevertheless plausible control policies—again 
often adapted to meeting a wide variety of objectives. For example, for storage embedded 
in a distribution network, two control policies are considered in [23]; the first policy aims 
to feed into a store only when necessary to keep local voltage levels within a predefined 
range and to empty the store again as soon as possible thereafter; the second policy aims 
to maintain a constant level of load in the network. For larger stores operating within 
transmission networks, the buffering policies studied have included that of a fixed target 
level policy [24], a dynamic target level policy [25], and a two-stage process with day-ahead
generation scheduling and an online procedure to adapt load levels [26].

Control policies have been studied via a range of analytic and simulation-based methods.
Examples of an analytic approach can be found in [27], where partial differential equations
are utilised to model the behaviour and control of a store, and in [23, 28], where
spectral analysis of wind and load data is used with models which also incorporate turbine
behaviour. Simulation-based studies include [24, 25], which use a bootstrap approach
based on real wind forecast error data, and [26], which uses Monte Carlo simulation of the
network state.

In the present paper we study the optimal control of a store which is used both for arbitrage
and for buffering against unpredictable events. As previously indicated we use an
economic framework, so that the store sees costs (positive or negative) associated with
buying and selling, and with the provision of buffering services. The store seeks to operate
in such a way as to minimise over time the sum of these costs. We believe such an
economic framework to be natural when the store operates as part of some larger and perhaps
very complex system, provided the price signals under which the store operates are
correctly chosen. The store may be sufficiently large as to have market impact, leading to 
nonlinear cost functions for buying and selling, may be subject to rate (as well as capacity) 
constraints, and, as will typically be the case, may suffer from round-trip inefficiencies. 
We formulate a stochastic model which is realistic in many circumstances and characterise 
some of the properties of an optimal control, relating the results to the existing experimental
literature. We develop the associated strong Lagrangian theory and, by making
a modest approximation (the validity of which may be tested in practical applications),
show how to construct a computationally tractable optimal control. These latter results
form a nontrivial extension of those of the “arbitrage-only” case studied in [5], and require
significant new developments of the necessary optimisation theory; as in [5], the optimal
control at any time usually depends on a relatively short time horizon (though one which 
is typically somewhat longer than in the earlier case), so that the algorithm is suitable for 
the optimal control of the store over an indefinite period of time. 

The optimal control is given by the solution, at the start of the control period, of a 
deterministic optimisation problem which can be regarded as that of minimising the costs 
associated with the store buying and selling added to those of notionally “insuring” for 
each future instant in time against the effects of the random fluctuations resulting from the
provision of buffering services. The cost of such “insurance” depends on the absolute level
of the store at that time. Thus this deterministic problem is that of choosing the vector of
successive levels of the store so as to minimise a cost function $\sum_{t=1}^{T} [C_t(x_t) + A_t(s_t)]$, subject
to rate and capacity constraints, where $C_t(x_t)$ is the cost of incrementing the level of the
store (positively or negatively) at time $t$ by $x_t$, and the “penalty” function $A_t$ is such that
$A_t(s_t)$ is the expected cost of any failure to provide the required buffering services at the
time $t$ when the level of the store is then $s_t$. We define this optimisation problem P more
carefully in Sections 2 and 5. In the stochastic environment in which the store operates, 
the solution of this deterministic problem determines the future control of the store until 
such time as its buffering services are actually required, following which the level of the 
store is perturbed and the optimisation problem must be re-solved starting at the new 
level. The continuation of this process provides what is in principle the exactly optimal 
stochastic control of the store on a potentially indefinite time scale. 

In Section 2 we formulate the relevant stochastic model and discuss its applicability. This 
enables us, in Section 3 to provide some characteristic properties of optimal solutions, 
which we relate to empirical work in the existing literature. In Sections 4 and 5 we develop 
the approach to an optimal control outlined above. Section 6 considers the deterministic 
optimisation problem associated with the stochastic control problem and derives the associated
strong Lagrangian theory, while in Section 7 we develop an efficient algorithm.
Section 8 gives examples.

2 Problem formulation


Consider the management of a store over a finite time interval [0, T] where the time horizon 
T is integer, and where [0, T] is divided into a succession of periods t = 1,..., T of integer 
length. At the start of each time period t the store makes a decision as to how much to 
buy or sell during that time period; however, the level of the store at the end of that time 
period may be different from that planned if, during the course of the period, the store is 
called upon to provide buffering services to deal with some unexpected problem or shock. 
Such a shock might be the need to supply additional energy during the time period t 
due to some unexpected failure—for example that of a generator—or might simply be the 
difference between forecast and actual renewable generation or demand. We suppose that 
the capacity of the store during the time period $t$ is $E_t$ units of energy. (Usually $E_t$ will
be constant over time, but need not be, and there are some advantages, discussed in particular
in Section 4, in allowing the time dependence.) Similarly we suppose that the total energy
which may be input or output during the time period $t$ is subject to rate (i.e. power)
constraints $P_{It}$ and $P_{Ot}$ respectively. This slotted-time model corresponds, for example,
to real world energy markets where energy is typically traded at half-hourly or hourly 
intervals, with the actual delivery of that energy occurring in the intervening continuous 
time periods. Detailed descriptions of the operation of the UK market can be found in 
[29, 30]. 

For each $t$ let $X_t = \{x : -P_{Ot} \le x \le P_{It}\}$. Both buying and selling prices associated
with any time period $t$ may be represented by a convex function $C_t$ defined on $X_t$ which
is such that, for positive $x$, $C_t(x)$ is the price of buying $x$ units of energy for delivery
during the time period $t$, while, for negative $x$, $C_t(x)$ is the negative of the price for selling
$-x$ units of energy during that time period. Thus, in either case, $C_t(x)$ is the cost of a
planned change of $x$ to the level of the store during the time period $t$, in the absence of
any buffering services being required during the course of that time period. The convexity
assumption corresponds, for each time $t$, to an increasing cost to the store of buying each
additional unit, a decreasing revenue obtained for selling each additional unit, and every
unit buying price being at least as great as every unit selling price. When, as is usually
the case, the store is not perfectly efficient in the sense that only a fraction $\eta < 1$ of
the energy input is available for output, then this may be captured in the cost function
by reducing selling prices by the factor $\eta$; under the additional assumption that the cost
functions $C_t$ are increasing it is easily verified that this adjustment preserves the above
convexity of the functions $C_t$. We thus assume that the cost functions are adjusted so
as to capture any such round-trip inefficiency.
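
As a minimal sketch of this adjustment, suppose the store is a price taker facing a single nonnegative market price in a given period. The function name make_cost, the parameter names and the sample numbers below are illustrative assumptions and are not taken from the paper. The selling price is scaled by the efficiency $\eta$, and the resulting piecewise-linear function is convex because the unit selling price cannot exceed the unit buying price.

```python
def make_cost(price, eta):
    """Return a cost function C(x) for one time period.

    Illustrative assumptions: buying costs `price` per unit, while selling
    earns only eta * price per unit, reflecting a round-trip efficiency eta.
    A nonnegative price corresponds to an increasing cost function.
    """
    if price < 0.0:
        raise ValueError("price must be nonnegative")
    if not 0.0 < eta <= 1.0:
        raise ValueError("eta must lie in (0, 1]")
    buy, sell = price, eta * price

    def cost(x):
        # Positive x: the store pays to buy.
        # Negative x: the store sells, so the cost is negative (a revenue).
        return buy * x if x >= 0 else sell * x

    return cost


c = make_cost(price=40.0, eta=0.8)
print(c(2.0))    # cost of buying 2 units
print(c(-2.0))   # negative cost (revenue) from selling 2 units
```

Since the slope of the function is $\eta$ times the price to the left of zero and the full price to the right, the slope is nondecreasing in $x$, which is the convexity property used throughout.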

Remark 1. A further form of possible inefficiency of a store is leakage, whereby a fraction 
of the contents of the store is lost in each unit of time. We do not explicitly model this 
here. However, only routine modifications are required to do so, and are entirely analogous 
to those described in [5]. 

Remark 2. Note also that, in the above model, it is possible to absorb the rate constraints
into the cost functions (by setting the costs associated with $x \notin X_t$ to be prohibitively
high) and to preserve the convexity of these functions. However, in general we prefer to
avoid this approach here.

Suppose that at the end of the time period $t-1$, or equivalently at the start of the time
period $t$, the level of the store is $s_{t-1}$ (where we take $s_0$ to be the initial level of the
store). We assume that one may then choose a planned adjustment (contract to buy or
sell) $x_t \in X_t$, such that additionally $s_{t-1} + x_t \in [0, E_t]$, to the level of the store
during the time period $t$, the cost of this adjustment being $C_t(x_t)$. Subsequent to this,
during the course of the time period $t$, the store may be subject to some shock or random
disturbance, corresponding perhaps to the need to provide unexpected buffering services,
which may both disturb the final level of the store at the end of that time period (and
perhaps also at the end of subsequent time periods) and have further associated costs,
the latter being typically those of the store not being able to provide the required services.
For each $t$, and for each possible level $s_{t-1}$ of the store at the end of the time period $t-1$,
define $V_{t-1}(s_{t-1})$ to be the expected future cost of subsequently managing the store under
an optimal strategy (i.e. one under which this expected cost is minimised), under the
assumption that either no shocks have occurred by the end of the time period $t-1$
or that, given the level $s_{t-1}$, such past shocks as have occurred by that time do not
influence the optimal future management of the store or its associated costs. Under these
conditions, and for a planned adjustment $x_t$ to the level of the store during the time
period $t$ (at an immediate cost $C_t(x_t)$ as indicated above), in the absence of any shock
during the time period $t$, the expected cost of optimally managing the store thereafter
is then $V_t(s_{t-1} + x_t)$. We assume that the expected additional cost to the store, both
immediate and future, of dealing optimally with any shock which may occur during the
time period $t$ is a function $A_t(s_{t-1} + x_t)$ of the planned level $s_{t-1} + x_t$ of the store for the
end of the time period $t$. We then have that

$$V_{t-1}(s_{t-1}) = \min_{\substack{x_t \in X_t \\ s_{t-1} + x_t \in [0, E_t]}} \left[ C_t(x_t) + A_t(s_{t-1} + x_t) + V_t(s_{t-1} + x_t) \right], \tag{1}$$

and that the optimal planned increment to the level of the store for the time period $t$
(given that an optimal policy is to be followed thereafter) is given by $x_t(s_{t-1})$ where this
is defined to be the value of $x_t \in X_t$ which achieves the minimisation in the recursion (1).
We also define the terminal condition

$$V_T(s_T) = 0 \tag{2}$$

for all possible levels $s_T$ of the store at the end of the time period $T$.
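
To make the recursion (1) and (2) concrete, the following is a brute-force sketch on an integer grid of store levels. It is not the algorithm developed later in the paper, merely a direct transcription of the recursion. The grid discretisation, the function name solve_recursion, the callback signatures, the use of a single time-independent capacity and pair of rate limits, and the sample prices and penalty are all assumptions made for illustration.

```python
def solve_recursion(T, levels, cost, penalty, max_in, max_out):
    """Backward recursion (1)-(2) on an integer grid of store levels.

    levels  : admissible integer store levels, e.g. list(range(E + 1))
    cost    : cost(t, x)    -> C_t(x), cost of a planned increment x
    penalty : penalty(t, s) -> A_t(s), expected shock cost at planned level s
    max_in, max_out : integer rate limits (time-independent in this sketch)
    Returns (V, policy): V[t][s] is the value function and policy[t][s] the
    optimal planned increment for period t from level s at the end of t - 1.
    """
    V = {T: {s: 0.0 for s in levels}}            # terminal condition (2)
    policy = {}
    level_set = set(levels)
    for t in range(T, 0, -1):
        V[t - 1], policy[t] = {}, {}
        for s_prev in levels:
            best_val, best_x = float("inf"), None
            for x in range(-max_out, max_in + 1):    # rate constraints
                s_new = s_prev + x
                if s_new not in level_set:           # capacity constraint
                    continue
                val = cost(t, x) + penalty(t, s_new) + V[t][s_new]
                if val < best_val:
                    best_val, best_x = val, x
            V[t - 1][s_prev], policy[t][s_prev] = best_val, best_x
    return V, policy


# Hypothetical data: three periods, capacity 4, equal buy and sell prices.
prices = [10.0, 30.0, 20.0]
V, policy = solve_recursion(
    T=3,
    levels=list(range(5)),
    cost=lambda t, x: prices[t - 1] * x,
    penalty=lambda t, s: 2.0 * (s - 2.5) ** 2,
    max_in=2,
    max_out=2,
)
print(policy[1])   # planned increment in period 1 for each starting level
```

The work grows with the number of grid levels times the number of admissible increments times $T$, and every value function must be computed in full, which illustrates why a direct dynamic programming approach becomes unattractive over long horizons.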

Note that $A_t(s_{t-1} + x_t)$ (which may alternatively be interpreted as the “insurance”
cost associated with the planned level of the store for the time period $t$ as described in
the Introduction) may be understood via a coupling argument, in which the possibly
disturbed and subsequently optimally controlled process of store levels, following any
shock in the time period $t$, is coupled to the process which is undisturbed in that time
period and subsequently optimally controlled; $A_t(s_{t-1} + x_t)$ is then the expected difference
in the costs of operating the two processes until such time (if ever) as they subsequently
merge. As we discuss further below, this interpretation proves useful in finding workable
approximations to the functions $A_t$.

Remark 3. We make the assumption above that each function $A_t$, representing the extra
cost of dealing with a shock occurring during the time period $t$, may be represented as a
function of the planned level $s_{t-1} + x_t$ of the store for the end of that time period and, given
this, does not further depend on the level of the store at the beginning of that time
period. The accuracy of this assumption will vary according to the precise characteristics of
the store, the way in which it interacts with its external environment in the event of shocks,
and the various cost functions which form part of the model. The assumption is likely to
be at its most accurate when rate constraints do not play a major role in the management
of the store, as the store may adjust to its target levels quickly. Elsewhere, when the level
of the store does not change too much during a single time period, the assumption may
still be regarded as a reasonable approximation. Its relaxation, for example by allowing
$A_t$ to be a more general function of $s_{t-1}$ and $x_t$, simply complicates without essentially
changing the analysis below.

Our aim is to determine the optimal control of the store over the time interval $[0, T]$.
Such a control will necessarily be stochastic. In principle some form of stochastic dynamic
programming approach is required. However, particularly within a time-heterogeneous
environment (in which there may be no form of regularity in either the functions $C_t$ or in
the shock processes), such an approach would be unlikely to be efficient, and might well
prove computationally intractable, on account of (a) the need, in such an approach, to
completely determine each of the functions $V_t$ defined above, and (b) the need to solve
the problem over the entire time interval $[t, T]$ in order to determine the optimal control
at any time $t$.

Our method of proceeding is therefore as follows. We assume that the functions $A_t$ are
known, at least to within reasonable approximations. (We argue in Section 4 that in
many cases the functions $A_t$ may be determined either exactly or to within a very good
approximation; this follows from the coupling characterisation of these functions introduced
above.) Given the initial level $s_0$ of the store we may then use the argument leading
to the recursion (1) and (2) to determine very efficiently a control which remains optimal
up to the end of the first time period in which a shock actually occurs. Following such a
shock (and, if necessary, once its knock-on effects have cleared from the system; again see
the discussion of Section 4), the current state (level) of the store is re-examined and the
optimal future control strategy recalculated. Iteration of this process leads to an efficient
(stochastic) dynamic control for the entire time interval $[0, T]$. We also show below that
typically the optimal decision at (the start of) any time $t$ depends only on the functions
$C_{t'}$ and $A_{t'}$ for values of time $t'$ extending only a little beyond the time $t$. The approach
outlined above is therefore generally also suitable for the ongoing optimal management of
the store over an indefinite period of time.
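
The plan-then-re-solve procedure just described may be sketched as a simple loop. The sketch assumes that the effect of each shock is confined to the period in which it occurs, so that re-planning can take place immediately, and that a solver for the deterministic problem is supplied by the caller. The names run_store, plan_from and realised_shock are hypothetical, and the sketch does not itself enforce capacity limits on the perturbed level.

```python
def run_store(s0, T, plan_from, realised_shock):
    """Follow the current deterministic plan until a shock occurs, then re-plan.

    plan_from(t, s)   : hypothetical solver returning the planned increments
                        [x_{t+1}, ..., x_T] when the level at the end of
                        period t is s
    realised_shock(t) : perturbation to the store level observed during
                        period t (0.0 if no shock occurred)
    Returns the list of realised levels [s_0, s_1, ..., s_T].
    """
    s = s0
    levels = [s0]
    plan, k = plan_from(0, s0), 0      # initial plan and position within it
    for t in range(1, T + 1):
        s += plan[k]                   # carry out the planned increment
        k += 1
        shock = realised_shock(t)
        if shock != 0.0:
            s += shock                 # the shock perturbs the level
            if t < T:
                # Re-solve the deterministic problem from the new level.
                plan, k = plan_from(t, s), 0
        levels.append(s)
    return levels


# Hypothetical usage: a "do nothing" plan and a single shock in period 2.
levels = run_store(
    s0=3.0,
    T=3,
    plan_from=lambda t, s: [0.0] * (3 - t),
    realised_shock=lambda t: -1.0 if t == 2 else 0.0,
)
print(levels)
```

The solver is called only at time 0 and after each shock, so when shocks are rare the computational effort is dominated by a small number of deterministic optimisations.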


3 Characterisation of optimal solutions 


In this section we establish some properties of the functions $x_t(\cdot)$ defined in the previous
section and determining the optimal control of the store.


One case which will be of particular interest is that where the store is a price taker (i.e.
the store is not so large as to impact itself on market prices), so that, for each $t$, the cost
function $C_t$ is given by

$$C_t(x) = \begin{cases} c_t^{(b)} x & \text{if } x \ge 0, \\ c_t^{(s)} x & \text{if } x < 0, \end{cases} \tag{3}$$

where the unit “buying” price $c_t^{(b)}$ and the unit “selling” price $c_t^{(s)}$ are such that
$c_t^{(s)} \le c_t^{(b)}$. (That, at any time $t$, the reward obtained in the market resulting from
decreasing the level of the store by a single unit may be less than the cost of increasing
the level of the store by a single unit may primarily reflect the fact that the store may be
less than perfectly efficient; see the discussion of Section 2.)

Proposition 1 below is a very simple result which shows that in the case where buying
and selling prices are equal (typically corresponding to a perfectly efficient store), and
provided rate constraints are nonbinding, the optimal policy is a “target” one. By this we
mean that for each time period $t$ there exists a target level $\hat{s}_t$ such that, given that the level of the
store at the end of the immediately preceding time period is $s_{t-1}$ and that shocks prior
to that time have no further ongoing effects on the management of the store, the optimal
planned level $s_{t-1} + x_t$ of the store to be achieved during the time period $t$ is set to the
value $\hat{s}_t$, independently of $s_{t-1}$.

Proposition 1. Suppose that, for each $t$, we have $c_t^{(b)} = c_t^{(s)} = c_t$ say; define

$$\hat{s}_t = \operatorname*{arg\,min}_{s \in [0, E_t]} \left[ c_t s + A_t(s) + V_t(s) \right]. \tag{4}$$

Then, for each $t$ and for each $s_{t-1}$, we have $x_t(s_{t-1}) = \hat{s}_t - s_{t-1}$ provided only that this
quantity belongs to the set $X_t$.

Proof. The recursion (1) here becomes, for each $t$,

$$V_{t-1}(s_{t-1}) = \min_{\substack{x_t \in X_t \\ s_{t-1} + x_t \in [0, E_t]}} \left[ c_t x_t + A_t(s_{t-1} + x_t) + V_t(s_{t-1} + x_t) \right], \tag{5}$$

and the above minimisation is achieved by $x_t$ such that $s_{t-1} + x_t = \hat{s}_t$, provided only that
$x_t \in X_t$. □
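
The substitution underlying this proof may be written out explicitly. Writing $s = s_{t-1} + x_t$ for the planned level (the symbol $s$ serves here only as a dummy variable), the bracketed expression in (5) becomes:

```latex
\begin{aligned}
c_t x_t + A_t(s_{t-1} + x_t) + V_t(s_{t-1} + x_t)
  &= c_t (s - s_{t-1}) + A_t(s) + V_t(s) \\
  &= \bigl[ c_t s + A_t(s) + V_t(s) \bigr] - c_t s_{t-1}.
\end{aligned}
```

The final term $c_t s_{t-1}$ does not depend on $s$, so minimising over $s \in [0, E_t]$ gives the same minimiser $\hat{s}_t$ as in (4), whatever the value of $s_{t-1}$. The rate constraint enters only through the requirement that $\hat{s}_t - s_{t-1}$ belongs to $X_t$.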

In order to deal with the possibility of rate constraint violation, with the more general
price-taker case where $c_t^{(s)} < c_t^{(b)}$, and with the quite general case where the cost
functions $C_t$ are merely required to be convex, we require the additional assumption of
convexity of the functions $A_t$. This latter condition, while not automatic, is reasonably
natural in many applications; see the examples of Section 8.

Proposition 2. Suppose that, in addition to convexity of the functions $C_t$, each of the
functions $A_t$ is convex. Then, for each $t$:

(i) the function $V_{t-1}$ is convex;

(ii) $x_t(s_{t-1})$ is a decreasing function of $s_{t-1}$;

(iii) $s_{t-1} + x_t(s_{t-1})$ is an increasing function of $s_{t-1}$.

Proof. It is helpful to define, for each $t = 1, \dots, T$, the function $U_{t-1}$ of each possible
level $s_{t-1}$ of the store at the end of the time period $t-1$, and each possible planned
increment $x_t$ to the level of the store for the time period $t$ by

$$U_{t-1}(s_{t-1}, x_t) = C_t(x_t) + A_t(s_{t-1} + x_t) + V_t(s_{t-1} + x_t). \tag{6}$$

The recursion (1) now becomes

$$V_{t-1}(s_{t-1}) = \min_{\substack{x_t \in X_t \\ s_{t-1} + x_t \in [0, E_t]}} U_{t-1}(s_{t-1}, x_t). \tag{7}$$

To show (i) we use backwards induction in time. The function $V_T$ is convex, since it is identically zero by (2). Suppose that,
for any given $t \le T$, the function $V_t$ is convex; we show that the function $V_{t-1}$ is convex.
For each of given values $s_{t-1}^{(i)}$, $i = 1, \dots, n$, of $s_{t-1}$, let $x_t^{(i)}$ be the value of $x_t$ which achieves
the minimisation in (7), and for any convex combination $\bar{s}_{t-1} = \sum_{i=1}^{n} \kappa_i s_{t-1}^{(i)}$, where each
$\kappa_i \ge 0$ and where $\sum_{i=1}^{n} \kappa_i = 1$, define also $\bar{x}_t = \sum_{i=1}^{n} \kappa_i x_t^{(i)}$. Note that $\bar{x}_t \in X_t$ and that
$\bar{s}_{t-1} + \bar{x}_t \in [0, E_t]$. Then, from (7),

$$\begin{aligned}
V_{t-1}(\bar{s}_{t-1}) &\le U_{t-1}(\bar{s}_{t-1}, \bar{x}_t) \\
&\le \sum_{i=1}^{n} \kappa_i U_{t-1}(s_{t-1}^{(i)}, x_t^{(i)}) \\
&= \sum_{i=1}^{n} \kappa_i V_{t-1}(s_{t-1}^{(i)}),
\end{aligned}$$

where the second line in the above display follows from the definition (6) of the function
$U_{t-1}$ and the convexity of the functions $C_t$, $A_t$ and $V_t$ (the latter by the inductive
hypothesis). Thus $V_{t-1}$ is convex as required.

To show (ii) and (iii), given values $s_{t-1}^{(1)} \le s_{t-1}^{(2)}$ of $s_{t-1}$, again let $x_t^{(1)}$, $x_t^{(2)}$ be the respective
values of $x_t$ which achieve the minimisation in (7). Since the function $U_{t-1}(s_{t-1}^{(1)}, \cdot)$
is minimised over the feasible set $\{x \in X_t : s_{t-1}^{(1)} + x \in [0, E_t]\}$ at $x_t^{(1)}$, it follows straightforwardly, from the definition (6) of the
function $U_{t-1}$ and the convexity of the function $C_t$ and that of the function $A_t + V_t$, that,
since $s_{t-1}^{(1)} \le s_{t-1}^{(2)}$, the minimisation of the function $U_{t-1}(s_{t-1}^{(2)}, \cdot)$ is achieved (or, in the case
of nonuniqueness, may be achieved) at $x_t^{(2)} \le x_t^{(1)}$. Thus the result (ii) follows. Similarly,
it is again straightforward from the convexity of the function $C_t$ and that of the function
$A_t + V_t$, and since $s_{t-1}^{(1)} \le s_{t-1}^{(2)}$, that $x_t^{(2)}$ is (or, in the case of nonuniqueness, may be taken
to be) such that $s_{t-1}^{(2)} + x_t^{(2)} \ge s_{t-1}^{(1)} + x_t^{(1)}$. The result (iii) thus similarly follows. □


Remark 4. Note that the rate constraints $x_t \in X_t$ for all $t$ cause no difficulties for the
above proof, a result which may alternatively be seen by absorbing these constraints into
the cost functions $C_t$ as described in Remark 2.

We now return to the price-taker case, in which the cost functions are as defined by (3), 
and which corresponds to a store which is not sufficiently large as to have market impact. 
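Here we may prove a strengthened version of Proposition 2. For each $t$, given that the
function $A_t$ is convex, define

$$s_t^{(b)} = \operatorname*{arg\,min}_{s \in [0, E_t]} \left[ c_t^{(b)} s + A_t(s) + V_t(s) \right] \tag{8}$$

and similarly define

$$s_t^{(s)} = \operatorname*{arg\,min}_{s \in [0, E_t]} \left[ c_t^{(s)} s + A_t(s) + V_t(s) \right]. \tag{9}$$

Note that the above convexity assumption and the condition that, for each $t$, we have
$c_t^{(s)} \le c_t^{(b)}$ imply that $s_t^{(b)} \le s_t^{(s)}$. We now have the following result.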


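Proposition 3. Suppose that the cost functions $C_t$ are as given by (3) and that the
functions $A_t$ are convex. Then the optimal policy is given by: for each $t$ and given $s_{t-1}$,
take

$$x_t = \begin{cases} \min(s_t^{(b)} - s_{t-1},\, P_{It}) & \text{if } s_{t-1} < s_t^{(b)}, \\ 0 & \text{if } s_t^{(b)} \le s_{t-1} \le s_t^{(s)}, \\ \max(s_t^{(s)} - s_{t-1},\, -P_{Ot}) & \text{if } s_{t-1} > s_t^{(s)}. \end{cases} \tag{10}$$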


Proof. For each $t$, it follows from the convexity of the functions $C_t$, $A_t$ and $V_t$ (the latter
by the first part of Proposition 2) that, for $s_{t-1} < s_t^{(b)}$, the function $C_t(x_t) + A_t(s_{t-1} + x_t) + V_t(s_{t-1} + x_t)$
is minimised by $x_t = s_t^{(b)} - s_{t-1}$, for $s_t^{(b)} \le s_{t-1} \le s_t^{(s)}$ it is minimised
by $x_t = 0$, while for $s_{t-1} > s_t^{(s)}$ it is minimised by $x_t = s_t^{(s)} - s_{t-1}$. The required result
now follows from the recursion (1). □

Thus in general in the price-taker case there exists, for each time period $t$, a “target
interval” $[s_t^{(b)}, s_t^{(s)}]$ such that, if the level of the store at the end of the previous time
period is $s_{t-1}$ (and again given that the shocks prior to this time have no ongoing effects
on the optimal management of the store), the optimal policy is to choose $x_t$ so that $s_{t-1} + x_t$
is the nearest point (in absolute distance) to $s_{t-1}$ lying within, or as close as possible to,
the above interval. In the case where $c_t^{(b)} = c_t^{(s)} = c_t$, the above interval shrinks to the
single point $\hat{s}_t$ defined by (4).
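
The target-interval rule (10) translates directly into code. The following sketch takes the interval endpoints as given, since computing them requires the value functions. The function and argument names are illustrative assumptions rather than notation from the paper.

```python
def planned_increment(s_prev, s_buy, s_sell, p_in, p_out):
    """Planned increment x_t under the target-interval policy (10).

    s_prev : level of the store at the end of the previous period
    s_buy  : lower end of the target interval, s_t^(b)
    s_sell : upper end of the target interval, s_t^(s), with s_buy <= s_sell
    p_in, p_out : nonnegative input and output rate limits (P_It, P_Ot)
    """
    if s_buy > s_sell:
        raise ValueError("target interval requires s_buy <= s_sell")
    if s_prev < s_buy:
        # Below the interval: buy up towards it, as far as the rate allows.
        return min(s_buy - s_prev, p_in)
    if s_prev > s_sell:
        # Above the interval: sell down towards it, as far as the rate allows.
        return max(s_sell - s_prev, -p_out)
    # Inside the interval: neither buy nor sell.
    return 0.0


print(planned_increment(1.0, 3.0, 6.0, 5.0, 5.0))   # reaches the lower end
print(planned_increment(1.0, 3.0, 6.0, 1.5, 5.0))   # limited by the input rate
print(planned_increment(9.0, 3.0, 6.0, 5.0, 2.0))   # limited by the output rate
```

Setting s_buy equal to s_sell recovers the single-target policy of Proposition 1, truncated by the rate limits.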

These results shed some light on earlier, more applied, papers of Bejan et al [24] and Gast
et al [25], in which the uncertainties in the operation of an energy store result from errors in
wind power forecasts. The model considered in those papers is close to that of the present
paper, as we now describe. The costs of operating the store result (a) from round-trip
inefficiency, which in the formulation of the present paper would be captured by the cost
functions $C_t$ as defined by (3) with $C_t$ the same for all $t$, and (b) from buffering events, i.e.
from failures to meet demand through insufficient energy available to be supplied from the
store when it is needed, and from energy losses through store overflows. In the formulation
of the present paper these costs would be captured by the functions $A_t$. In contrast to
the present paper, decisions affecting the level of the store (the amount of conventional
generation to schedule for a particular time) are made $n$ time steps, rather than a single
time step, in advance. The shocks to the system result from the differences between
the available wind power as forecasted $n$ steps ahead of real time (when conventional
generation is scheduled) and the wind power actually obtained. Although the model of
the above papers is therefore not exactly the same as that of the present paper, the
underlying arguments leading to Propositions 1-3 continue to apply, at least to a good
approximation. In particular sample path arguments suggest that the reduction of round-trip
efficiency slows the rate at which the store-level trajectories (started from different
initial levels but with the same stochastic description of future shock processes) converge
over subsequent time.

Bejan et al [24] consider only the case where the round-trip efficiency is 1. They study
the efficiency of policies, analogous to those suggested by Proposition 1, whereby, for
each $t$, the generation scheduled for time $t$ at the earlier time $t - n$ is such as would, given
perfect forecasting, achieve a given target level $\hat{s}_t$ of the store at time $t$; this target level
is independent of the level $s_{t-n}$ of the store at the time $t - n$ and of earlier scheduling
decisions. However, Bejan et al [24] further take $\hat{s}_t$ to be independent of $t$, something which
may not be optimal given the likely nonstationarity of the process of forecast errors.

Gast et al [25] subsequently study the same time series of available wind power, but allow
for round-trip efficiencies which are less than 1. They find (as might be expected here)
that simple “target” policies such as that described above do not work well under these
circumstances, and compare the behaviour of a variety of time-homogeneous policies.

4 Determination of the functions $A_t$

We described in Section 2 how, given a knowledge of the functions $A_t$, the optimal control of
the store could be determined. In Sections 5-7 we develop such an approach, which is based
on strong Lagrangian theory and which is very much more efficient, in senses explained
there, than the application of standard dynamic programming or nonlinear optimisation
techniques. In this section we consider conditions under which the functions $A_t$ may be
thus known, either exactly or to good approximations.

Suppose that, as in Section 2, at the end of the time period $t-1$ the level of the store
is $s_{t-1}$ and that, given $s_{t-1}$, any shocks prior to that time have no further effect on the
optimal management of the store. Suppose further that an increase of $x_t$ (positive or
negative) is planned for the time period $t$ (at a cost of $C_t(x_t)$). Recall that $A_t(s_{t-1} + x_t)$ is
then defined to be the expected additional cost to the store of dealing optimally with any
shock which may occur during the time period $t$, and may be conveniently characterised
in terms of the coupling defined in Section 2. Now define also $\bar{A}_t(s_{t-1} + x_t)$ to be the
expected additional cost to the store of dealing with any shock which may occur during
the time period $t$ and immediately returning the level of the store to its planned level
$s_{t-1} + x_t$ at the end of the time period $t$. As in the case of the function $A_t$, we assume that
each function $\bar{A}_t$ depends on $s_{t-1}$ and $x_t$ through their sum $s_{t-1} + x_t$, the extent to which
this approximation is reasonable being as discussed for the functions $A_t$. Given the costs
of dealing with any shocks, and the known costs of making any immediate subsequent
adjustments to the level of the store, the functions $\bar{A}_t$ are readily determinable, and in
particular do not depend on how the store is controlled outside the time period $t$.

Note that, in the case of linear cost functions (i.e. $C_t(x) = c_t x$ for all $t$) and when shocks
do not have effects which persist beyond the end of the time period in which they occur,
the argument of Proposition 1 implies immediately that $\bar{A}_t = A_t$ for all $t$: the linearity of
$C_t$ implies that, at the end of the time period $t-1$ and when the level of the store is then
$s_{t-1}$, if $s_{t-1} + x_t$ is the optimal planned level of the store for the end of the time period $t$,
then it remains the optimal level of the store for the end of that time period following any
shock which occurs during it.

More generally the functions $\bar{A}_t$ provide reasonable approximations to the functions $A_t$ to
the extent to which it is reasonable, following any shock with which the store is required
to deal, to return immediately the level of the store to that which would have obtained in
the absence of the shock. In particular, when shocks are relatively rare but are potentially
expensive (as might be the case when the store is required to pay the costs of failing to
have sufficient energy to deal with an emergency), then the major contribution to both
the functions $A_t$ and $\bar{A}_t$ will be this cost, regardless of precisely how the level of the store
is adjusted in the immediate aftermath of the shock.

If necessary, better approximations to the functions At may be obtained by allowing longer 
periods of time in which to optimally couple the trajectory of the store level, following a 
shock, to that which would have obtained in its absence. In applications one would wish 
to experiment a little here. 

In applications there is also a need, when the costs of a shock arise from a failure to have 
sufficient energy in the store to deal with it, to identify what these costs are. There are 
various possible candidates. Two simple candidates, natural in the context of risk metrics for 
power systems, where they correspond respectively to loss of load and energy unserved 
(see, for example, [31]), are: 

(i) for each $t > 0$, the cost of a shock occurring during the time period $t$ is simply some 
positive constant if there is insufficient energy within the store to meet it, and is 
otherwise 0. 

(ii) for each $t > 0$, the cost of a shock occurring during the time period $t$ is proportional 
to the shortfall in the energy necessary to meet that shock. 

Given the planned level $s_{t-1} + x_t$ of the store to be achieved during any time period $t$, 
the total additional cost of dealing with any shock occurring during that time period (as 
defined for example in terms of the coupling introduced in Section 2) is a random variable 
which is a function of the size of the shock. The distribution of this random variable, and 
so also its expectation $A_t(s_{t-1} + x_t)$, may need to be determined by observation. 
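
As a minimal sketch of how such an expectation might be estimated from observation under the two candidate cost models (i) and (ii), suppose (hypothetically) that a sample `shocks` of per-period shock energies is available, with 0 recorded for periods without a shock. The function names and the parameters `penalty` and `unit_cost` are illustrative assumptions rather than notation from the text, and the sketch ignores rate constraints and the cost of any subsequent adjustment of the store level.

```python
from typing import Sequence


def loss_of_load_cost(level: float, shocks: Sequence[float], penalty: float) -> float:
    """Model (i): a fixed penalty whenever a shock exceeds the stored energy."""
    failures = sum(1 for size in shocks if size > level)
    return penalty * failures / len(shocks)


def unserved_energy_cost(level: float, shocks: Sequence[float], unit_cost: float) -> float:
    """Model (ii): a cost proportional to the shortfall max(size - level, 0)."""
    shortfall = sum(max(size - level, 0.0) for size in shocks)
    return unit_cost * shortfall / len(shocks)
```

Both estimates are decreasing in `level`, in line with the expectation that a fuller store is better placed to absorb a shock.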

Note finally that the effects of shocks may persist over several time periods (as, for example, 
when the store is required to provide ongoing support for the sudden loss of a major piece 
of equipment such as a generator), so that each of the functions $A_t$ (which will in general 
be decreasing) need not be flat for values of its argument in excess of the output rate 
constraint $P_{Ot}$. In particular a reasonable way of dealing with a shock whose effects do 
persist over several time periods may simply be to reserve notionally sufficient energy in 
the store to deal with it; then, following such a shock, the level of the store will temporarily 
become the excess over that reserve and the capacity of the store will correspondingly be 
temporarily reduced. This causes no problems for the present theory, and is a reason for 
allowing a possible time dependence (which may be dynamic) for the capacity of the store. 
We consider some plausible functional forms of the functions $A_t$ in Section 8. 

5 The optimal control problem 

We now assume that the functions $A_t$ defined in Section 2 are known, at least to a sufficiently 
good approximation (see the discussion of the previous section). 

Define (the random variable) $\hat{s} = (\hat{s}_0, \dots, \hat{s}_T)$ (with $\hat{s}_0 = s_0^*$) to be the levels of the store at 
the end of the successive time periods $t = 0, \dots, T$ under the (stochastic) optimal control 
as defined in Section 2. Recall also from Section 2 that, for each $t$ and each level $s_{t-1}$ of 
the store at the end of the time period $t - 1$, the quantity $\hat{x}_t(s_{t-1})$ is the value of $x_t \in X_t$ 
which achieves the minimisation in the recursion (1). 

For any vector $s = (s_0, \dots, s_T)$ and for each $t = 1, \dots, T$, define 

$$x_t(s) = s_t - s_{t-1}. \tag{11}$$

Define also the following (deterministic) optimisation problem: 

P: choose $s = (s_0, \dots, s_T)$ with $s_0 = s_0^*$ so as to minimise 

$$\sum_{t=1}^{T} [C_t(x_t(s)) + A_t(s_t)] \tag{12}$$

subject to the capacity constraints 

$$0 \le s_t \le E_t, \qquad 1 \le t \le T, \tag{13}$$

and the rate constraints 

$$x_t(s) \in X_t, \qquad 1 \le t \le T. \tag{14}$$
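
To make the structure of the problem P concrete, the following minimal sketch evaluates the objective (12) and checks the constraints (13) and (14) for a candidate vector of store levels. It assumes that each set $X_t$ is an interval $[-P_O, P_I]$ with time-independent rates; the names `objective`, `is_feasible`, `p_in` and `p_out` are illustrative and are not taken from the text.

```python
from typing import Callable, Sequence

Cost = Callable[[int, float], float]  # (t, argument) -> cost at time t


def objective(s: Sequence[float], C: Cost, A: Cost) -> float:
    """Objective (12): sum over t = 1..T of C_t(s_t - s_{t-1}) + A_t(s_t)."""
    return sum(C(t, s[t] - s[t - 1]) + A(t, s[t]) for t in range(1, len(s)))


def is_feasible(s: Sequence[float], E: Sequence[float],
                p_in: float, p_out: float, tol: float = 1e-12) -> bool:
    """Constraints (13)-(14), with X_t assumed to be the interval [-p_out, p_in].

    E[t] is the capacity at time t = 1..T (E[0] is unused).
    """
    for t in range(1, len(s)):
        x = s[t] - s[t - 1]
        if not (-tol <= s[t] <= E[t] + tol):
            return False          # capacity constraint (13) violated
        if not (-p_out - tol <= x <= p_in + tol):
            return False          # rate constraint (14) violated
    return True
```

For instance, with `C = lambda t, x: 2 * x` and `A = lambda t, s: 1.0`, the plan `[0, 1, 0]` buys one unit and then sells it again, so that only the two buffering terms contribute to the objective.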

Let $s^* = (s_0^*, \dots, s_T^*)$ denote the solution to the above problem P. It follows from direct 
iteration of the recursion (1), using also (2), that $x_1(s^*)$ achieves the minimisation in (1) for 
$t = 1$ and when $s_0 = s_0^*$, i.e. $\hat{x}_1(\hat{s}_0) = \hat{x}_1(s_0^*) = x_1(s^*)$. Thus, from (11), provided no 
shock occurs during the time period 1, so that the level $\hat{s}_1$ of the store under the stochastic 
optimal control is given by $\hat{s}_1 = \hat{s}_0 + \hat{x}_1(\hat{s}_0)$, we have also that $\hat{s}_1 = s_1^*$. 
More generally, let the random variable $T'$ index the first time period during which a 
shock does occur. Then repeated application of the above argument gives immediately 
the following result. 

Proposition 4. For all $t < T'$, we have $\hat{s}_t = s_t^*$. 

The solution to the problem P therefore defines the optimal control of the store up to the 
end of the time period $T'$ defined above. At that time, and at the end of each subsequent 
time period during which there occurs a shock, it is of course necessary to reformulate 
the problem P, starting at the end of the time period $T'$ (or as soon as any shock occurring 
during that time period has been fully dealt with), instead of at time 0, and replacing the 
initial level $\hat{s}_0 = s_0^*$ by the perturbed level $\hat{s}_{T'}$ of the store at that time. Thus the stochastic 
optimal control problem may be solved dynamically by the solution of the problem P at 
time 0, and the further solution of (a reformulated version of) this problem at the end of 
each subsequent time period in which a shock occurs. The solution of the problem, which 
we now consider, is very much simpler than that of the corresponding stochastic dynamic 
programming approach. 
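
The re-solving scheme just described can be sketched as a simple loop. This is an illustration under stated assumptions rather than a definitive implementation: `solve` stands for any (hypothetical) routine returning the planned levels of the deterministic problem from a given level and start time, and `realise` is a hypothetical model returning the level actually reached at the end of each period (equal to the planned level when no shock occurs).

```python
from typing import Callable, List, Sequence

Solver = Callable[[float, int], Sequence[float]]   # (level, start time) -> planned levels
Realise = Callable[[int, float], float]            # (t, planned level) -> realised level


def control_with_resolving(s0: float, T: int, solve: Solver, realise: Realise) -> List[float]:
    """Follow the plan from `solve`, re-solving whenever a shock perturbs the level."""
    levels = [s0]
    start, plan = 0, solve(s0, 0)          # plan[j] is the planned level at time start + j
    for t in range(1, T + 1):
        planned = plan[t - start]
        actual = realise(t, planned)       # equals `planned` when no shock occurs
        levels.append(actual)
        if actual != planned and t < T:    # a shock occurred: reformulate P from time t
            start, plan = t, solve(actual, t)
    return levels
```

The deterministic problem is thus solved once at time 0 and once more after each period in which a shock occurs, and never in between.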

6 Lagrangian theory and characterisation of solution 

We showed in the previous section that, to the extent that the functions $A_t$ are known, 
an optimal control for the store may be developed via the solution of the optimisation 
problem P defined there. In Section 4 we discussed how to make what are in many cases 
good and readily determinable approximations for the functions $A_t$. 

We again assume convexity of the functions $A_t$ (see Section 3), in addition to that of the 
functions $C_t$. We develop the strong Lagrangian theory [32, 33] associated with the problem 
P. This leads both to an efficient algorithm for its solution, and to the identification 
of the Lagrange multipliers necessary for the proper dimensioning of the store. In particular 
Theorem 5 establishes the existence of a pair of vectors $(s^*, \lambda^*)$ such that $s^*$ solves 
the problem P and $\lambda^*$ is a function of the associated Lagrange multipliers corresponding 
to the capacity constraints (see below); the theorem further gives conditions necessarily 
satisfied by the pair $(s^*, \lambda^*)$. 

We now introduce the more general problem $P(a, b)$ in which $s_0$ is kept fixed at the value 
$s_0^*$ of interest above, but in which $s_1, \dots, s_T$ are allowed to vary between quite general 
upper and lower bounds: 

$P(a, b)$: choose $s = (s_0, \dots, s_T)$, with $s_0 = s_0^*$, so as to minimise 

$$\sum_{t=1}^{T} [C_t(x_t(s)) + A_t(s_t)] \tag{15}$$

subject to the capacity constraints 

$$a_t \le s_t \le b_t, \qquad 1 \le t \le T, \tag{16}$$

and the rate constraints 

$$x_t(s) \in X_t, \qquad 1 \le t \le T. \tag{17}$$

Here $a = (a_1, \dots, a_T)$ and $b = (b_1, \dots, b_T)$ are such that $a_t \le b_t$ for all $t$. Let also $a^*$ and 
$b^*$ be the values of $a$ and $b$ corresponding to our particular problem P of interest, i.e. 

$$a_t^* = 0, \quad b_t^* = E_t, \qquad 1 \le t \le T. \tag{18}$$

Note that the convexity of the functions $C_t$ and $A_t$ guarantees their continuity, and, since 
for each $a$, $b$ as above the space of allowed values of $s$ is compact, a solution $s^*(a, b)$ to 
the problem $P(a, b)$ always exists. Let $V(a, b)$ be the corresponding minimised value of 
the objective function, i.e. 

$$V(a, b) = \sum_{t=1}^{T} [C_t(x_t(s^*(a, b))) + A_t(s_t^*(a, b))].$$

Observe also that the function $V(a, b)$ is itself convex in $a$ and $b$. To see this, consider any 
convex combination $(\bar{a}, \bar{b}) = (\kappa a_1 + (1 - \kappa) a_2, \kappa b_1 + (1 - \kappa) b_2)$ of any two values $(a_1, b_1)$, 
$(a_2, b_2)$ of the pair $(a, b)$, where $0 \le \kappa \le 1$; since the constraints (16) and (17) are linear, it 
follows that the vector $\bar{s} = \kappa s^*(a_1, b_1) + (1 - \kappa) s^*(a_2, b_2)$ is feasible for the problem $P(\bar{a}, \bar{b})$; 
hence, from the convexity of the functions $C_t$ and $A_t$, 

$$
\begin{aligned}
V(\bar{a}, \bar{b}) &\le \sum_{t=1}^{T} [C_t(x_t(\bar{s})) + A_t(\bar{s}_t)] \\
&= \sum_{t=1}^{T} [C_t(\kappa x_t(s^*(a_1, b_1)) + (1 - \kappa) x_t(s^*(a_2, b_2))) + A_t(\kappa s_t^*(a_1, b_1) + (1 - \kappa) s_t^*(a_2, b_2))] \\
&\le \kappa \sum_{t=1}^{T} [C_t(x_t(s^*(a_1, b_1))) + A_t(s_t^*(a_1, b_1))] + (1 - \kappa) \sum_{t=1}^{T} [C_t(x_t(s^*(a_2, b_2))) + A_t(s_t^*(a_2, b_2))] \\
&= \kappa V(a_1, b_1) + (1 - \kappa) V(a_2, b_2).
\end{aligned}
$$

We now have the following result, which encapsulates the relevant strong Lagrangian 
theory. 

Theorem 5. Let $s^*$ denote the solution to the problem P. Then there exists a vector 
$\lambda^* = (\lambda_1^*, \dots, \lambda_T^*)$ such that 

(i) for all vectors $s$ such that $s_0 = s_0^*$ and $x_t(s) \in X_t$ for all $t$ ($s$ is not otherwise 
constrained), 

$$\sum_{t=1}^{T} [C_t(x_t(s)) + A_t(s_t) - \lambda_t^* s_t] \ge \sum_{t=1}^{T} [C_t(x_t(s^*)) + A_t(s_t^*) - \lambda_t^* s_t^*]; \tag{19}$$

(ii) the pair $(s^*, \lambda^*)$ satisfies the complementary slackness conditions, for $1 \le t \le T$, 

$$
\begin{cases}
\lambda_t^* = 0 & \text{if } 0 < s_t^* < E_t, \\
\lambda_t^* \ge 0 & \text{if } s_t^* = 0, \\
\lambda_t^* \le 0 & \text{if } s_t^* = E_t.
\end{cases}
\tag{20}
$$

Conversely, suppose that there exists a pair of vectors $(s^*, \lambda^*)$, with $s_0 = s_0^*$, satisfying the 
conditions (i) and (ii) and such that $s^*$ is additionally feasible for the problem P. Then 
$s^*$ solves the problem P. 

Proof. Consider the general problem $P(a, b)$ defined above. Introduce slack (or surplus) 
variables $z = (z_1, \dots, z_T)$ and $w = (w_1, \dots, w_T)$ and rewrite $P(a, b)$ as: 

$P(a, b)$: minimise $\sum_{t=1}^{T} [C_t(x_t(s)) + A_t(s_t)]$ over all $s = (s_0, \dots, s_T)$ with $s_0 = s_0^*$, over all 
$z \ge 0$, over all $w \ge 0$, and subject to the further constraints 

$$s_t - z_t = a_t, \qquad 1 \le t \le T, \tag{21}$$

$$s_t + w_t = b_t, \qquad 1 \le t \le T, \tag{22}$$

and also $x_t(s) \in X_t$ for $1 \le t \le T$. 

Since the function $V(a, b)$ is also convex in $a$ and $b$, it follows from the supporting hyperplane 
theorem (see [32] or [33]), that there exist Lagrange multipliers $\alpha^* = (\alpha_1^*, \dots, \alpha_T^*)$ 
and $\beta^* = (\beta_1^*, \dots, \beta_T^*)$ such that 

$$V(a, b) \ge V(a^*, b^*) + \sum_{t=1}^{T} \alpha_t^* (a_t - a_t^*) + \sum_{t=1}^{T} \beta_t^* (b_t - b_t^*) \quad \text{for all } a, b. \tag{23}$$

Thus also, for all $s$ with $s_0 = s_0^*$ and such that $x_t(s) \in X_t$ for $1 \le t \le T$, for all $z \ge 0$, and 
for all $w \ge 0$, by defining $a$ and $b$ via (21) and (22), we have 

$$\sum_{t=1}^{T} [C_t(x_t(s)) + A_t(s_t) - \alpha_t^* (s_t - z_t) - \beta_t^* (s_t + w_t)] \ge \sum_{t=1}^{T} [C_t(x_t(s^*)) + A_t(s_t^*) - \alpha_t^* a_t^* - \beta_t^* b_t^*]. \tag{24}$$

Since the components of $z$ and $w$ may take arbitrary positive values, we obtain at once 
the following complementary slackness conditions for the vectors of Lagrange multipliers 
$\alpha^*$ and $\beta^*$: 

$$\alpha_t^* \ge 0, \quad \alpha_t^* = 0 \text{ whenever } s_t^* > a_t^*, \qquad 1 \le t \le T, \tag{25}$$

$$\beta_t^* \le 0, \quad \beta_t^* = 0 \text{ whenever } s_t^* < b_t^*, \qquad 1 \le t \le T. \tag{26}$$

Thus, from (24)-(26), by taking $z_t = w_t = 0$ for all $t$ on the left side of (24), it follows 
that, for all $s$ with $s_0 = s_0^*$ and $x_t(s) \in X_t$ for $1 \le t \le T$, 

$$\sum_{t=1}^{T} [C_t(x_t(s)) + A_t(s_t) - (\alpha_t^* + \beta_t^*) s_t] \ge \sum_{t=1}^{T} [C_t(x_t(s^*)) + A_t(s_t^*) - (\alpha_t^* + \beta_t^*) s_t^*]. \tag{27}$$

The condition (i) of the theorem now follows on defining 

$$\lambda_t^* = \alpha_t^* + \beta_t^*, \qquad 1 \le t \le T, \tag{28}$$

while the condition (ii) follows from (28) on using also the complementary slackness 
conditions (25) and (26). 

To prove the converse result, suppose that a pair $(s^*, \lambda^*)$ (with $s_0 = s_0^*$) satisfies the 
conditions (i) and (ii) and that $s^*$ is feasible for the problem P. From the condition (ii), 
we may define (unique) vectors $\alpha^* = (\alpha_1^*, \dots, \alpha_T^*)$ and $\beta^* = (\beta_1^*, \dots, \beta_T^*)$ such that the 
conditions (25), (26) and (28) hold. The condition (i) of the theorem now translates to 
the requirement that, for all vectors $s$ such that $s_0 = s_0^*$ and $x_t(s) \in X_t$ for all $t$, the 
relation (27) holds. Finally, it follows from this and from the conditions (25) and (26) 
that, for any vector $s$ which is feasible for the problem P, and so in particular satisfies 
$0 \le s_t \le E_t$ for all $t$, 

$$\sum_{t=1}^{T} [C_t(x_t(s)) + A_t(s_t)] \ge \sum_{t=1}^{T} [C_t(x_t(s^*)) + A_t(s_t^*)], \tag{29}$$

so that $s^*$ solves the problem P as required. □ 
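
As an illustration of how the complementary slackness conditions (20) might be verified numerically for a candidate pair of vectors, consider the following minimal sketch. The function name, the tolerance `tol` and the indexing convention (index 0 unused) are assumptions made for the example. In the degenerate case $E_t = 0$ both bounds bind at once, and the sketch then leaves the sign of the multiplier unrestricted, following the decomposition (28) with (25) and (26).

```python
from typing import Sequence


def satisfies_complementary_slackness(s: Sequence[float], lam: Sequence[float],
                                      E: Sequence[float], tol: float = 1e-9) -> bool:
    """Check conditions (20) for t = 1..T; entries at index 0 are ignored."""
    for t in range(1, len(s)):
        at_lower = abs(s[t]) <= tol
        at_upper = abs(s[t] - E[t]) <= tol
        if at_lower and at_upper:
            continue                      # E_t = 0: multiplier may have either sign
        if at_lower and lam[t] < -tol:
            return False                  # need lambda_t >= 0 when s_t = 0
        if at_upper and lam[t] > tol:
            return False                  # need lambda_t <= 0 when s_t = E_t
        if not at_lower and not at_upper and abs(lam[t]) > tol:
            return False                  # need lambda_t = 0 in the interior
    return True
```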

Remark 5. Note that the second part of Theorem 5, i.e. the converse result, does not 
require the convexity assumptions on the functions $C_t$ and $A_t$. 

The above Lagrangian theory, which we require for the determination of the optimal 
control as described in Section 7, further enables a determination of the sensitivity of 
the value of the store with respect to variation of its capacity constraints. For the given 
problem P, the cost of optimally operating the store (the negative of its value) is given by 
$V(a^*, b^*)$, where we recall that $a^*$ and $b^*$ are as given by (18). For any $t$, the derivative 
of this optimised cost with respect to $E_t$, assuming this derivative to exist, is given by the 
Lagrange multiplier $\beta_t^*$ defined in the above proof (the differentiability assumption ensuring 
that $\beta_t^*$ is here uniquely defined). Note further that when $s_t^* < E_t$ then (from (26)) the 
Lagrange multiplier $\beta_t^*$ is equal to zero, and when $s_t^* = E_t$ then (from (25) and (28)) we 
have $\beta_t^* = \lambda_t^*$. 

A further determination of the sensitivity of the value of the store with respect to variation 
of its rate constraint may be developed along the lines of Theorem 5 of Cruise et al [5], 
but we do not pursue this here. 


7 Determination of $(s^*, \lambda^*)$ 

The structure of the objective function causes some difficulties for the solution of the 
problem P. As previously observed, a dynamic programming approach might seem natural 
but, even for this deterministic problem, typically remains too computationally complex, 
on account of both the likely time-heterogeneity of the functions $A_t$ and $C_t$, and of the 
need, even for small $t$, to consider the problem over the entire time interval $[0, T]$. 

We continue to assume convexity of the functions $C_t$ and $A_t$. Under the further assumption 
of differentiability of the functions $A_t$, we give an efficient algorithm for the construction 
of a pair $(s^*, \lambda^*)$ satisfying the conditions of Theorem 5, so that, in particular, $s^*$ solves 
the problem P. This algorithm is further sequential and local in time, in the sense that 
the determination of the solution to any given time $t' \le T$ typically requires only the 
consideration of the problem, i.e. a knowledge of the functions $C_t$ and $A_t$, for those times $t$ 
extending to some time horizon which is typically only a short distance beyond $t'$. We 
have already shown in Section 5 that the ability to dynamically solve the deterministic 
problem P, or updates of this problem, at the times of successive shocks enables an efficient 
(stochastically) optimal control of the store. 

We give conditions necessarily satisfied by the pair $(s^*, \lambda^*)$. Under the further assumption 
of strict convexity of the functions $C_t$, we show how these conditions may be used to 
determine $(s^*, \lambda^*)$ uniquely. We then indicate how the strict convexity assumption may 
be relaxed. 

Proposition 6. Suppose that the functions $A_t$ are differentiable, and that the pair $(s^*, \lambda^*)$ 
is such that $s^*$ is feasible for the problem P, while $(s^*, \lambda^*)$ satisfies the condition (ii) of 
Theorem 5. For each $t$ define 

$$\nu_t^* = \sum_{u=t}^{T} [\lambda_u^* - A_u'(s_u^*)]. \tag{30}$$

Then the condition that $(s^*, \lambda^*)$ satisfies the condition (i) of Theorem 5 is equivalent to 
the condition that 

$$x_t(s^*) \text{ minimises } C_t(x) - \nu_t^* x \text{ in } x \in X_t, \qquad 1 \le t \le T. \tag{31}$$

Proof. Assume that the pair $(s^*, \lambda^*)$ is as given. Suppose first that additionally $(s^*, \lambda^*)$ 
satisfies the condition (i) of Theorem 5. The condition (31) is then straightforward when 
the functions $C_t$ are additionally differentiable: for each $t$ the partial derivative of the left 
side of (19) with respect to $x_t(s)$ (with $x_u(s)$ being kept constant for $u \ne t$) is necessarily 
zero at $s = s^*$, so that (31) follows from the assumed convexity of the functions $C_t$. For 
the general case, note that it follows from the condition (i) of Theorem 5 (by considering 
$s$ such that $s_0 = s_0^*$, $x_t(s) = x_t(s^*) + h$, $x_u(s) = x_u(s^*)$ for $u \ne t$), that, for all $t$ and for 
all real $h$, 

$$C_t(x_t(s^*) + h) + \sum_{u=t}^{T} [A_u(s_u^* + h) - \lambda_u^* h] \tag{32}$$

is minimised at $h = 0$, and so, for all (small) $h$, 

$$C_t(x_t(s^*) + h) - \nu_t^* h \ge C_t(x_t(s^*)) + o(h), \quad \text{as } h \to 0. \tag{33}$$

Thus (31) again follows from the assumed convexity of the functions $C_t$. 

To prove the converse result, suppose now that $(s^*, \lambda^*)$ satisfies the condition (31). This 
condition, together with the convexity and differentiability of the functions $A_t$, then 
implies that, for all $t$, the expression (32) is minimised at $h = 0$. It is now straightforward 
that the hyperplane whose vector of slopes is $\lambda^*$ supports the function 
$\sum_{t=1}^{T} [C_t(x_t(s)) + A_t(s_t)]$ at the point $(s^*, \sum_{t=1}^{T} [C_t(x_t(s^*)) + A_t(s_t^*)])$, so that finally the 
condition (i) of Theorem 5 holds as required. □ 

It now follows from Theorem 5 and Proposition 6 that if the pair $(s^*, \lambda^*)$ is such that $s^*$ 
is feasible for the problem P, and $(s^*, \lambda^*)$ satisfies both the condition (31) and the 
condition (ii) of Theorem 5, then $s^*$ further solves the problem P. 

We now show how to construct such a pair $(s^*, \lambda^*)$. We assume, for the moment, strict 
convexity of the functions $C_t$; we subsequently indicate how to relax this assumption. It 
follows from the assumed strict convexity that, for each $t$ and for each $\nu_t$, there is a unique 
$x \in X_t$, which we denote by $x_t^*(\nu_t)$, which minimises $C_t(x) - \nu_t x$ in $X_t$. Further $x_t^*(\nu_t)$ is 
continuous and increasing in $\nu_t$, strictly so for $\nu_t$ such that $x_t^*(\nu_t)$ lies in the interior of 
$X_t$. In particular, from (11), the condition (31) may now be rewritten as 

$$s_t^* = s_{t-1}^* + x_t^*(\nu_t^*), \qquad 1 \le t \le T. \tag{34}$$

It further follows from (30) that 

$$\nu_{t+1}^* = \nu_t^* + A_t'(s_t^*) - \lambda_t^*, \qquad 1 \le t \le T - 1. \tag{35}$$

Thus, were the vector $\lambda^*$ known, together with the value of the constant $\nu_1^*$, the pair 
$(s^*, \nu^*)$ could be constructed sequentially via (34) and (35). We observe that, while 
$\lambda^*$ is not known, it does satisfy the conditions (20) and in particular the requirement 
that $\lambda_t^* = 0$ for all $t$ such that $0 < s_t^* < E_t$. We now follow a procedure which is a 
generalisation of one described by Cruise et al [5], and which involves an essentially 
one-dimensional search so as to identify the constant $\nu_1^*$. This search, which may be thought 
of as being carried out at time zero and which is not computationally intensive (see the 
further remarks at the end of this section), then needs to be repeated at each of a number 
of subsequent times as described below. We show how to define inductively a sequence of 
times $0 = T_0 < T_1 < \dots < T_k = T$ such that $s^*_{T_i} = 0$ or $s^*_{T_i} = E_{T_i}$ for $1 \le i < k$ and 
such that $\lambda_t^* = 0$ for all values of $t$ not in the above sequence. 

The time $T_1$ is chosen as follows. Consider trial values $\nu_1$ of $\nu_1^*$. For each such $\nu_1$ define 
a pair of vectors $\nu = (\nu_1, \dots, \nu_T)$ and $s = (s_1, \dots, s_T)$ (with $s_0 = s_0^*$) by 

$$s_t = s_{t-1} + x_t^*(\nu_t), \qquad 1 \le t \le T, \tag{36}$$

$$\nu_{t+1} = \nu_t + A_t'(s_t), \qquad 1 \le t \le T - 1. \tag{37}$$

Define $M$ and $M'$ to be the sets of values of $\nu_1$ for which the vector $s$ defined via (36) 
and (37) violates one of the capacity constraints (13) and first does so respectively below 
or above, in either case at a time which we denote by $\bar{T}_1(\nu_1)$. Since, for each $t$, $x_t^*(\nu_t)$ is 
increasing in $\nu_t$ and $A_t'(s_t)$ is increasing in $s_t$ (by the convexity of $A_t$), it follows that if 
$\nu_1 \in M$ then $\nu_1' \in M$ for all $\nu_1' < \nu_1$ and that if $\nu_1 \in M'$ then $\nu_1' \in M'$ for all $\nu_1' > \nu_1$; 
further the sets $M$ and $M'$ are disjoint, and (since the solution set for the problem P is 
nonempty) neither the set $M$ nor the set $M'$ can be the entire real line. Let $\bar{\nu}_1 = \sup M$. 
(In the extreme case where $M$ is empty we may set $\bar{\nu}_1 = -\infty$.) We now consider the 
behaviour of the corresponding vector $s$ defined via (36) and (37) where we take $\nu_1 = \bar{\nu}_1$; 
for this vector $s$ there are three possibilities: 

(a) the quantity $\bar{\nu}_1$ belongs neither to the set $M$ nor to the set $M'$, i.e. the vector $s$ 
generated as above is feasible for the problem P; in this case we take $T_1 = T$ and 
$s^* = s$ with $\nu_1^* = \bar{\nu}_1$ and $\lambda_t^* = 0$ for $1 \le t \le T - 1$ (so that the remaining values of $\nu_t^*$ 
are given by (35)); 

(b) the quantity $\bar{\nu}_1$ belongs to the set $M$; in this case there exists at least one $t < \bar{T}_1(\bar{\nu}_1)$ 
such that $s_t = E_t$ (were this not so then, by the continuity of each $x_t^*(\nu_t)$ in $\nu_t$, the 
value of $\nu_1$ could be increased above $\bar{\nu}_1$ while remaining within the set $M$); define $T_1$ 
to be any such $t$, say the largest, and take $s_t^* = s_t$ for $1 \le t \le T_1$ with $\nu_1^* = \bar{\nu}_1$ and 
$\lambda_t^* = 0$ for $1 \le t \le T_1 - 1$; 

(c) the quantity $\bar{\nu}_1$ belongs to the set $M'$; in this case, similarly to the case (b), there 
exists at least one $t < \bar{T}_1(\bar{\nu}_1)$ such that $s_t = 0$; define $T_1$ to be any such $t$, again 
say the largest, and again take $s_t^* = s_t$ for $1 \le t \le T_1$ with $\nu_1^* = \bar{\nu}_1$ and $\lambda_t^* = 0$ for 
$1 \le t \le T_1 - 1$. 
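
The first stage of this construction can be sketched as follows, under stated assumptions: `x_star(t, nu)` and `dA(t, s)` are user-supplied callables standing for the minimiser of $C_t(x) - \nu x$ over $X_t$ and for the derivative $A_t'$, the capacities are held in a list `E` with `E[0]` unused, and the function names and the tolerance `tol` are illustrative. The sketch only classifies a trial value and brackets $\sup M$ by bisection; deciding between the cases (a), (b) and (c), and restarting from $T_1$, is left out.

```python
from typing import Callable, List, Sequence, Tuple

Step = Callable[[int, float], float]  # (t, argument) -> value


def forward_path(nu1: float, s0: float, x_star: Step, dA: Step,
                 E: Sequence[float], tol: float = 1e-9) -> Tuple[List[float], str, int]:
    """Iterate (36)-(37) from the trial value nu1, with all multipliers set to 0.

    Returns the levels generated so far, a status ('below', 'above' or
    'feasible') and the time of the first capacity violation (T if none).
    """
    T = len(E) - 1
    s, nu = [s0], nu1
    for t in range(1, T + 1):
        level = s[-1] + x_star(t, nu)        # recursion (36)
        s.append(level)
        if level < -tol:
            return s, "below", t             # nu1 belongs to M
        if level > E[t] + tol:
            return s, "above", t             # nu1 belongs to M'
        nu += dA(t, level)                   # recursion (37)
    return s, "feasible", T


def bracket_sup_M(s0: float, x_star: Step, dA: Step, E: Sequence[float],
                  lo: float, hi: float, iters: int = 60) -> Tuple[float, float]:
    """Bisection for sup M, assuming that lo lies in M and that hi does not."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if forward_path(mid, s0, x_star, dA, E)[1] == "below":
            lo = mid                         # mid is in M, so sup M >= mid
        else:
            hi = mid                         # mid is not in M, so sup M <= mid
    return lo, hi
```

The bisection is valid because, as noted above, membership of $M$ is monotone in the trial value. In floating-point arithmetic the returned bracket only approximates $\sup M$, so that in practice the boundary contact times in the cases (b) and (c) would have to be identified up to a tolerance.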

In each of the cases (b) and (c) above, we now repeat the above procedure, starting at the 
time $T_1$ instead of the time 0, and considering trial values of $\nu^*_{T_1+1}$, thereby identifying 
the time $T_2$ and the values of $s_t^*$ and $\nu_t^*$ for $T_1 + 1 \le t \le T_2$, and taking $\lambda_t^* = 0$ for 
$T_1 + 1 \le t \le T_2 - 1$. The quantity $\lambda^*_{T_1}$ is now defined via (35). Further consideration of 
the sets $M$ and $M'$ defined above in relation to the identification of $\nu_1^* = \bar{\nu}_1$ shows easily 
that in the case $\bar{\nu}_1 \in M$, so that $s^*_{T_1} = E_{T_1}$, the quantity $\nu^*_{T_1+1}$ is necessarily 
such that $\lambda^*_{T_1} \le 0$, as required by (20) (since in this case, by the above construction, the 
quantity $\nu^*_{T_1+1}$ has a value which is necessarily at least as great as would have been the 
case had $\lambda^*_{T_1}$ been equal to 0, and $\lambda^*_{T_1}$ enters (35) with a negative sign), whereas in the 
case $\bar{\nu}_1 \in M'$, so that $s^*_{T_1} = 0$, the quantity $\nu^*_{T_1+1}$ is necessarily such that $\lambda^*_{T_1} \ge 0$. 

For $T_2 < T$ we continue in this manner until the entire sequence $0 = T_0 < T_1 < \dots < 
T_k = T$ is identified. We thus obtain vectors $s^*$, $\lambda^*$ and $\nu^*$ such that $s^*$ is feasible for the 
problem P, while $(s^*, \lambda^*)$ satisfies the condition (31) and the condition (ii) of Theorem 5, 
so that $s^*$ solves the problem P as required. 

In the case where, for at least some $t$, the cost function $C_t$ is convex, but not necessarily 
strictly so, some extra care is required. Here, for such $t$, the function $\nu \mapsto x_t^*(\nu)$ is not 
in general uniquely defined; further, for any given choice, this function is not in general 
continuous. However, the above construction of $(s^*, \lambda^*)$ continues to hold provided that, 
where necessary, we choose the right value of $x_t^*(\nu)$. The latter may always be identified 
by considering, for example, a sequence of strictly convex functions converging to $C_t$ 
and identifying $x_t^*(\nu)$ as the limit of its corresponding values within this sequence. 

Note that the above construction proceeds locally in time, in the sense that, at each 
successive time $T_i$, the determination of the subsequent time $T_{i+1}$ and of the values of 
$s_t^*$ and $\nu_t^*$ for $T_i + 1 \le t \le T_{i+1}$ only requires consideration of the functions $C_t$ and 
$A_t$ up to some time $\bar{T}_{i+1}$ (necessarily beyond $T_{i+1}$), the identification of which does not 
depend on the functions $C_t$ and $A_t$ at any subsequent times. More precisely we have 
$\bar{T}_1 = \bar{T}_1(\bar{\nu}_1)$, where $\bar{T}_1(\nu_1)$ is as identified above, and the remaining $\bar{T}_i$, $2 \le i \le k$, are 
similarly identified. In particular we have that, for each time $t$ and given the level of the 
store at the preceding time, the optimal choice of store level $s_t^*$ depends only on the 
functions $C_{t'}$ and $A_{t'}$ for $t \le t' \le \bar{T}(t)$, where 
we define $\bar{T}(t) = \bar{T}_{i+1}$ for $i$ such that $T_i + 1 \le t \le T_{i+1}$. The function $\bar{T}(t)$ is piecewise 
constant in $t$, and so the time horizon or look-ahead time $\bar{T}(t) - t$ required for the optimal 
decision at each time $t$ has the “sawtooth” shape which we illustrate in our examples of 
Section 8. 
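
The sawtooth shape of the look-ahead time can be seen in a small sketch. Here `boundaries` is assumed to hold the times $T_0 = 0, T_1, \dots, T_k = T$ and `horizons[i]` the time up to which the cost functions are needed for the decisions on the interval from $T_i + 1$ to $T_{i+1}$; both lists and the function name are hypothetical inputs chosen for illustration, not outputs of any routine in the text.

```python
from bisect import bisect_left
from typing import List, Sequence


def look_ahead(t: int, boundaries: Sequence[int], horizons: Sequence[int]) -> int:
    """Look-ahead time at time t (1 <= t <= T).

    The index i is chosen such that boundaries[i] + 1 <= t <= boundaries[i + 1].
    """
    i = bisect_left(boundaries, t) - 1
    return horizons[i] - t


def sawtooth(boundaries: Sequence[int], horizons: Sequence[int]) -> List[int]:
    """Look-ahead times for all t = 1..T."""
    T = boundaries[-1]
    return [look_ahead(t, boundaries, horizons) for t in range(1, T + 1)]
```

Within each interval the look-ahead time falls by one unit per time step, and it jumps up again at the start of the next interval.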

Note further that a lengthening of the total time $T$ over which the optimisation is to be 
performed does not in general change the values of the times $T_i$, but rather simply creates 
more of them. In particular the solution to the problem P involves computation which 
grows essentially linearly in $T$, and the algorithm is suitable for the management of a store 
with an infinite time horizon. 

The typical length of the intervals between the successive times $T_i$ depends on the shape 
of the functions $C_t$ and $A_t$ and in particular on the rate at which they fluctuate in time. 
Thus, for example, the long-run management of a store for which the functions $C_t$ show 
strong daily fluctuations typically involves decision making on a running time horizon of 
the order of a day or so. 

Finally note that, as already indicated, in the implementation of the above construction, 
some form of one-dimensional search is usually required to determine each of the successive 
$\nu^*_{T_i+1}$: each trial value of this quantity provides either an upper or lower bound to the true 
value, so that, for example, a simple binary search is sufficient. Given also the “locality” 
property referred to above, the numerical effort involved in the implementation of the 
above algorithm is usually very slight. 

8 Examples 

We give some examples, in which we solve (exactly) the optimal control problem P formally 
defined in Section 5. We investigate how the optimal solution depends on the cost 
functions $C_t$ defined there, which reflect buying and selling costs and hence the opportunity 
to make money from price arbitrage, and on the functions $A_t$ which reflect the costs 
of providing buffering services. 

The cost functions Ct are derived from half-hourly electricity prices in the Great Britain 
spot market over the entire year 2011, adjusted for a modest degree of market impact, as 
described in detail below. Thus we work in half-hour time units, with the time horizon T 
corresponding to the number of half-hour periods in the entire year. These spot market 
prices show a strong daily cyclical behaviour (corresponding to daily demand variation), 
being low at night and high during the day. This price variation can be seen in Figure 1 
which shows half-hourly GB spot prices (in pounds per megawatt-hour) throughout the 
month of March 2011. There is a similar pattern of variation throughout the rest of the 
year. 

Figure 1: GB half-hourly spot prices (£/MWh) for March 2011. [Plot titled “Spot Prices, March 2011”; the horizontal axis runs from 1 March to 29 March.] 


Without loss of generality, we choose energy units such that the rate (power) constraints 
are given by $P_{It} = P_{Ot} = 1$ unit of energy per half-hour period. For illustration, we 
take the capacity of the store to be given by $E_t = 10$ units of energy; thus the store can 
completely fill or empty over a 5-hour period, which is the case, for example, for the large 
Dinorwig pumped storage facility in Snowdonia [34]. 

We choose cost functions $C_t$ of the form 

$$
C_t(x) =
\begin{cases}
c_t x (1 + \delta x), & \text{if } x \ge 0, \\
\eta c_t x (1 + \delta x), & \text{if } x < 0,
\end{cases}
\tag{38}
$$

where the $c_t$ are proportional to the half-hourly electricity spot prices referred to above, 
where $\eta$ is an adjustment to selling prices representing in particular round-trip efficiency as 
described in Section 2, and where the factor $\delta > 0$ is chosen so as to represent a degree of 
market impact (higher unit prices as the store buys more and lower unit prices as the store 
sells more). For our numerical examples we take $\eta = 0.85$, which is a typical round-trip 
efficiency for a pumped-storage facility such as Dinorwig. We choose $\delta = 0.05$; since the 
rate constraints for the store are $P_{It} = P_{Ot} = 1$ this corresponds to a maximum market 
impact of 5%. While this is modest, our results are qualitatively little affected as $\delta$ is 
varied over a wide range of values less than one, covering therefore the range of possible 
market impact likely to be seen for storage in practice. 
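
For cost functions of the form (38) the minimiser of $C_t(x) - \nu x$ over the rate constraints, as used in the construction of Section 7, is available in closed form. The following is a minimal sketch assuming a strictly positive price `c` and taking $X_t$ to be the interval $[-P_{Ot}, P_{It}]$; the function names and default arguments are illustrative choices. Setting the derivative of each branch of (38) equal to $\nu$ gives the unconstrained minimiser, which is then clipped to the interval (valid because the objective is convex in one variable).

```python
def cost(x: float, c: float, eta: float = 0.85, delta: float = 0.05) -> float:
    """Cost function (38) for a spot price c > 0."""
    if x >= 0:
        return c * x * (1 + delta * x)
    return eta * c * x * (1 + delta * x)


def x_star(nu: float, c: float, eta: float = 0.85, delta: float = 0.05,
           p_in: float = 1.0, p_out: float = 1.0) -> float:
    """Minimiser of cost(x, c) - nu * x over -p_out <= x <= p_in."""
    if nu > c:                       # buying branch: c (1 + 2 delta x) = nu
        x = (nu / c - 1) / (2 * delta)
    elif nu < eta * c:               # selling branch: eta c (1 + 2 delta x) = nu
        x = (nu / (eta * c) - 1) / (2 * delta)
    else:                            # eta c <= nu <= c: neither trade is worthwhile
        x = 0.0
    return max(-p_out, min(p_in, x))
```

The flat region between $\eta c_t$ and $c_t$ reflects the round-trip inefficiency: for values of $\nu$ in this range the store neither buys nor sells.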

Finally we need to choose the functions $A_t$ reflecting the costs of providing buffering 
services. Our aim here is to give an understanding of how the optimal control of the 
store varies according to the relative economic importance of cost arbitrage and buffering, 
i.e. according to the relative size of the functions $C_t$ and $A_t$. We choose functions $A_t$ 
which are constant over time $t$ and of the form $A_t(s) = a e^{-\kappa s}$ and $A_t(s) = b/s$ for a 
small selection of the parameters $a$, $\kappa$ and $b$. The extent to which a store might provide 
buffering services in applications is extremely varied, and so the likely balance between 
arbitrage and buffering cannot be specified in advance. Rather we choose just sufficient 
values of the above parameters to show the effect of varying this balance. For a possible 
justification of the chosen forms of the functions $A_t$ (including why they should not necessarily 
be truncated to 0 for values of $s$ greater than the rate constraint of 1), see Section 4; in 
particular the form $A_t(s) = a e^{-\kappa s}$ is plausible in the case of light-tailed shocks, while the 
form $A_t(s) = b/s$ shows the effect of a slow rate of decay in $s$. 
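
Since the recursion (37) requires the derivatives of the functions $A_t$, it may help to record the two functional forms together with their derivatives. This is a minimal sketch; the function names are illustrative, and the reciprocal form is only defined for strictly positive store levels.

```python
import math


def A_exp(s: float, a: float, kappa: float) -> float:
    """Exponential form a * exp(-kappa * s)."""
    return a * math.exp(-kappa * s)


def dA_exp(s: float, a: float, kappa: float) -> float:
    """Derivative of the exponential form with respect to s."""
    return -a * kappa * math.exp(-kappa * s)


def A_inv(s: float, b: float) -> float:
    """Reciprocal form b / s, defined for s > 0."""
    return b / s


def dA_inv(s: float, b: float) -> float:
    """Derivative of the reciprocal form with respect to s (s > 0)."""
    return -b / s ** 2
```

For positive parameters both forms are decreasing and convex, so that their derivatives are negative and increasing in the store level, as the monotonicity argument of Section 7 requires.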

In each of our examples, we determine the optimal control of the store over the entire 
year, with both the initial level $s_0^*$ and the final level $s_T^*$ given by $s_0^* = s_T^* = 0$. In each 
of the corresponding figures, the upper panel shows the optimally controlled level of the 
store throughout the month of March. The lower panel shows, for each time $t$ in the same 
month, the time horizon (or look-ahead time) $\bar{T}(t) - t$, defined in Section 7, i.e. the length 
of time beyond the time $t$ for which knowledge of the cost functions is required in order 
to make the optimal decision at time $t$. 

Figure 2 shows the optimal control of the store when the functions $A_t$ are given by $A_t(s) = 
a e^{-\kappa s}$. The uppermost panels correspond to $a = 0$, so that the store incurs no penalty for 
failing to provide buffering services and optimises its control solely on the basis of arbitrage 
between energy prices at different times. The daily cycle of prices is sufficiently pronounced 
that here the store fills and empties (or nearly so) on a daily basis, notwithstanding the 
facts that the round-trip efficiency of 0.85 is considerably less than 1 and that the minimum 
time for the store to fill or empty is 5 hours. It will be seen also that the time horizon, or 
look-ahead time, required for the determination of optimal decisions is in general of the 
order of one or two days. 

The central panels of Figure 2 correspond to $\kappa = 1$ and $a = 1$. The choice of $a$ in particular 
is such that the store is just sufficiently incentivised by the need to reduce buffering costs 
that it rarely empties completely (though it does so very occasionally). Otherwise the 
behaviour of the store is very similar to that in the case a = 0. Note also that in this 
case the time horizons or look-ahead times are in general somewhat longer; an intuitive 
explanation (backed by a careful examination of the figure) is that, starting from a time 
when the store is full, the determination of by how much the store should avoid emptying 
completely requires taking account of the cost functions for a longer period of future time 
than is the case where the store does empty completely. 

Finally the bottom two panels of Figure 2 correspond to $\kappa = 1$ and $a = 10$. Here the costs 
of failing to provide buffering services are much higher, and so the optimised level of the 
store rarely falls below 25% of its capacity. Curiously the look-ahead times are in general 
less than in the case a = 1—presumably since the store level is more often reaching the 
capacity constraint. 

Variation of the exponential parameter $\kappa$ does not result in dramatically different 
behaviour, so we do not pursue this here. 

Figure 3 shows the optimal control of the store when the functions $A_t$ are given, for each $t$, 
by $A_t(s) = b/s$. The upper panels correspond to $b = 0$, so that we again have $A_t(s) = 0$ 
for all $s$ and the control is as observed previously. The lower panels correspond to the case 
$b = 1$, and, as might be expected, the behaviour here is somewhat intermediate between 
that for the two nonzero exponentially decaying functions. 

Figure 2: Store level and time horizon throughout March 2011 for the example with 
$A_t(s) = a e^{-\kappa s}$. The top panels correspond to $a = 0$, the central panels to $a = 1$, $\kappa = 1$, 
and the bottom panels to $a = 10$, $\kappa = 1$. 

Figure 3: Store level and time horizon throughout March 2011 for the example with 
$A_t(s) = b/s$. The upper panels correspond to $A_t(s) = 0$ and the lower panels to $A_t(s) = 1/s$. 